Subspace Similarity Search: Efficient k-NN Queries in Arbitrary Subspaces

Authors

  • Thomas Bernecker
  • Tobias Emrich
  • Franz Graf
  • Hans-Peter Kriegel
  • Peer Kröger
  • Matthias Renz
  • Erich Schubert
  • Arthur Zimek
Abstract

There are abundant scenarios for applications of similarity search in databases where the similarity of objects is defined for a subset of attributes, i.e., in a subspace, only. While much research has been done in efficient support of single column similarity queries or of similarity queries in the full space, scarcely any support of similarity search in subspaces has been provided so far. The three existing approaches are variations of the sequential scan. Here, we propose the first index-based solution to subspace similarity search in arbitrary subspaces.
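For orientation, the sequential-scan baseline that the abstract mentions can be sketched as below. This is only a minimal illustration of a subspace k-NN query, not the index-based method the paper proposes. The function name `subspace_knn`, the Euclidean distance, and the tuple-based data layout are assumptions made for the example.

```python
import heapq
import math


def subspace_knn(data, query, dims, k):
    """Return the indices of the k rows of `data` closest to `query`,
    measuring distance only over the attribute positions in `dims`.

    Hypothetical helper: a plain sequential scan using Euclidean distance
    (an assumption; the abstract does not fix a distance function).
    """
    def dist(row):
        # Only the attributes of the chosen subspace contribute to the distance.
        return math.sqrt(sum((row[d] - query[d]) ** 2 for d in dims))

    # Scan every object once and keep the k smallest distances.
    return heapq.nsmallest(k, range(len(data)), key=lambda i: dist(data[i]))


if __name__ == "__main__":
    points = [(0.0, 0.0, 9.0), (1.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
    q = (0.0, 0.0, 0.0)
    # The same query object yields different neighbors in different subspaces.
    print(subspace_knn(points, q, dims=[0, 1], k=2))
    print(subspace_knn(points, q, dims=[2], k=1))
```

Because the set of attributes in `dims` can differ from query to query, a single full-space index does not directly help, which is why such scans serve as the baseline.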

Similar resources

Similarity Search in Arbitrary Subspaces Under Lp-Norm

Similarity search has been widely used in many applications such as information retrieval, image data analysis, and time-series matching. Specifically, a similarity query retrieves all data objects in a data set that are similar to a given query object. Previous work on similarity search usually considers the search problem in the full space. In this paper, however, we propose a novel problem, s...

Dynamic High Dimensional Data Mapping for Efficient Similarity Query Processing

For efficient processing of similarity queries, the search space is often reduced by pruning inactive query subspaces which do not contain any query results so only those active query subspaces which may contain query results are examined. Among the active query subspaces, however, not all of them contain query results; an active query subspace that later turns out to contain no query results a...

Subspace Nearest Neighbor Search - Problem Statement, Approaches, and Discussion - Position Paper

Computing the similarity between objects is a central task for many applications in the field of information retrieval and data mining. For finding k-nearest neighbors, typically a ranking is computed based on a predetermined set of data dimensions and a distance function, constant over all possible queries. However, many high-dimensional feature spaces contain a large number of dimensions, man...

Advanced indexing and query processing for multidimensional databases

Many new applications, such as multimedia databases, employ the so-called feature transformation which transforms important features or properties of data objects into high-dimensional points. Searching for 'similar' or 'nondominated' objects based on these features is thus a search of points in this feature space. To support efficient query processing in these high dimensional databases, hig...

Isotropic Constant Dimension Subspace Codes

In the network coding setting, a constant dimension code is a set of k-dimensional subspaces of F_q^n. If F_q^n is a nondegenerate symplectic vector space with bilinear form f, an isotropic subspace U of F_q^n is a subspace such that f(x, y) = 0 for all x, y ∈ U. We introduce isotropic subspace codes simply as a set of isotropic subspaces and show how the isotropic property is used in the decoding process, then...

Publication date: 2010